Preface

When  you  make  a  thing,  a  thing 

that  is  new,  it  is  so  complicated 
making  it 

that  it  is  bound  to  be  ugly. 

But  those  that  make  it  after  you, 
they  don’t  have  to  worry 

about  making  it. 

And  they  can  make  it  pretty,  and 
so  everybody  can  like  it 
when  the  others 
make  it  after  you. 

Picasso  (as  quoted  by  Gertrude  Stein) 

[From  Victor  Papanek  (1982),  Design  for  the  Real  World , 
London:  Granada  Publishing,  p.  131.] 


Program Visualization Project
Final Report: Theory, Results, and Conclusions
Human Computing Resources
Aaron Marcus and Associates


page  Table  of  Contents 


1  Chapter  1:  Introduction 
3  Our  Approach 

5  Programs  as  Publications 

6  The  Goal  of  Our  Research 

7  Methodology  of  Our  Research 

9  The  Final  Report  and  the  Deliverables 

11  Chapter  2:  An  Example  of  the  Design  of  Program  Appearance 

28  Chapter  3:  C  Program  Books 

30  Secondary  Text:  Front  Matter 

31  Tertiary  Text:  User  Documentation 

32  Primary  Text:  The  Program 

33  Secondary  Text:  Metadata  and  Commentaries 

34  Tertiary  Text:  Indices  and  Overviews 

35  Tertiary  Text:  Programmer  Documentation 


36  Chapter 4: Graphic Design of C Source Code and Comments

38  The Presentation of Program Metadata

39  The Spatial Composition of Comments

41  The Typography of Punctuation

42  Typographic Encodings of Token Attributes

44  The Presentation of Preprocessor Commands

45  The Presentation of Declarations

46  The Visual Parsing of Expressions

47  The Visual Parsing of Statements

49  The Presentation of Function Definitions

50  The Presentation of Program Structure

51  Chapter 5: Conclusions

53  Chapter 6: Future Research

58  Appendix A: Bibliography




List  of  Figures 


15  A  listing  of  a  simple  desk  calculator  program  produced  on  a  dot 
matrix  line  printer 

19  A  listing  of  the  desk  calculator  program  produced  on  a  laser  printer 
23  The  desk  calculator  program  produced  on  a  laser  printer  using  the 
SEE  program  visualizer 
29  The  structure  of  a  program  book 




Chapter  1 




Introduction 


The continuous and spectacular development of computer hardware
that has occurred over the past four decades has finally
been matched in recent years with corresponding advances in
software engineering, that is, in the technology and processes of
software development.

Typically, efforts have been made on a number of fronts. The
most widespread development has been the concern with the logical
structure and expressive style of programs. Out of this concern
have emerged many of the modern software development
techniques, including top-down design and stepwise refinement
[Wirth, 1971], structured programming [Dahl, Dijkstra & Hoare,
1972], modularity [Parnas, 1972], and software tools [Kernighan
& Plauger, 1976]. A second development has been the marked
improvement in the clarity and expressive power of programming
languages, as can be seen for example in Modula [Wirth, 1977].
Another kind of development has occurred in the organization
and management of the team that produces the writing. This has
given rise, for example, to the concepts of chief programmer
teams [Baker, 1972] and structured walkthroughs [Yourdon, 1979].

The above advances have not been aided by progress in interactive
computer graphics, but some other areas have benefited. It is
now possible to construct interactive editors for various graphic
notations that express algorithms and data structures, for
example, Nassi-Schneiderman diagrams [Nassi & Schneiderman,
1973], Warnier-Orr diagrams [Higgins, 1979], contour diagrams
[Organick & Thomas, 1974], and SADT diagrams [Ross, 1977]. (See
[Martin & McClure, 1985] for a recent survey of these diagramming
schemes and notations.) Even more significant is the increasing
interest in enhancing the technology to support the writing and
maintaining of good programs by providing, for example, integrated
software development environments [Wasserman, 1981] such as
INTERLISP [Teitelman, 1979] and high-performance personal
workstations specialized to the task of program development [Gutz,
Wasserman & Spier, 1981].

How have these developments improved the daily life of most
programmers? Almost all have benefited from the use of modern
programming languages. On the other hand, the impact of new
software development methodologies, programmer team organizations,
graphic diagramming notations, and sophisticated
programmer development environments has been limited for the
most part to those working in research laboratories and in large
corporate programming shops. Significant assistance has not yet
been available to the lone programmer or small programming
group who typically work in BASIC or C on systems of moderate
complexity.

Section 1.1 Our Approach


We  have  taken  a  different  approach  in  our  recent  work  [Marcus 
&  Baecker,  1982;  Baecker  &  Marcus,  1983].  We  focused  on  every 
programmer’s  vehicle  of  discourse:  the  program,  expressed  in 
some  computer  language  and  appearing  in  some  form  on  some 
physical  medium. 

Since the advent of programming, the technologies of the video
display terminal and the line printer have limited the presentation
of a computer program's source code and comments to the use of a
single type font, at a single point size, with fixed-width characters,
and sometimes without even the use of upper and lower case.
The technologies of high resolution bit-mapped displays, laser
printers, and computer-driven phototypesetters, on the other hand,
allow for the production of far richer representations, embodying
multiple fonts, non-alphanumeric symbols, variable point sizes,
variable character widths, proportional character spacing, variable
word spacing and line spacing, gray scale tints, rules, and arbitrary
spatial location and orientation of elements on a page. We
therefore explore systematically in our work how these capabilities
can be used to enhance the art of program presentation.

Our work thus encompasses the field of prettyprinting, an area in
which others before us have worked with more limited graphics
tools. The earliest work was done on LISP, so that program readers
would not drown in a sea of parentheses. The problems of
prettyprinting PASCAL have elicited a long correspondence in the ACM
SIGPLAN Notices [Hueras & Ledgard, 1977; Grogono, 1979; Gustafson,
1979; Leinbaugh, 1980]. A discussion of prettyprinting algorithms
and their complexity has appeared [Oppen, 1980]. Other
authors [Rose & Welsh, 1977; Rubin, 1983] demonstrated methods
of extending the syntactic descriptions of programming languages to
include their formatting conventions. One paper [Miara, Musselman,
Navarro & Schneiderman, 1983] includes a review of a number
of human factors experiments concerning the effect of program
indentation on program comprehensibility. Unfortunately, these
experiments have generally failed to provide experimental
confirmation of what every programmer knows: a program's appearance
dramatically affects its comprehensibility and useability.

Our work, however, goes significantly beyond suggesting recommended
conventions for appearance that enhance the prettyprinting
of program code. We have also developed a flexible tool with
which future programmers and human factors specialists may
tune and improve these conventions, thus paving the way for
successful standards. In addition, we have considered the entire
context in which code is presented, a context which includes the
supporting texts and notations that make a program a living piece of
written communication.

Section 1.2 Programs as Publications


Programs are publications, a form of literature. Just as English
prose can range in scope from a note scribbled on a pad to a
historical treatise appearing in multiple volumes and representing a
lifetime of work, so do we find a variety of programs ranging
from a two line shell script created whenever needed to an edition
of the collected program works of a laboratory, as is the case, for
example, with the UNIX (tm) operating system. (See [Lions, 1977]
for an early example of this idea applied to the UNIX kernel.) The
line printer listing, which represents the output of conventional
program publishing technology, is woefully inadequate for documenting
an encyclopedic collection of code such as the UNIX system, or
even for such lesser program treatises as compilers, graphics
subroutine packages, and data base management systems.

What we have done, therefore, is to apply the tools of modern
computer graphics technology and the visible language skills of graphic
design, guided by the metaphors and precedents of literature,
printing, and publishing, to suggest and demonstrate in prototype form
that enduring programs should and can be made more accessible
and more useable.

We divide the content of a program into three kinds of text:
primary, secondary, and tertiary. Primary text includes what
typically appears in a program listing: the program code and comments.
Secondary text includes various metadata describing the context in
which the program is used and various short commentaries (often
mechanically produced) pointing out salient features of the
program. Tertiary text includes the various longer descriptions and
explanations of the program that typically are called documentation.


(tm) UNIX is a trademark of AT&T Bell Laboratories.

Section 1.3 The Goal of Our Research


Our goal has been to take a fresh approach to the presentation of
source text, and thereby to make it:

—  more  legible 

—  more  readable 

—  more  intelligible 

—  more  vivid 

—  more  appealing 

—  more  memorable 

—  more  useful 


—  more  maintainable. 
Section 1.4 Methodology of Our Research


Our  research  has  proceeded  as  follows: 

We first developed a graphic design taxonomy for computer-based
documents and publications. This was intended to be a checklist
for approaches to enhancing source code presentation [Gerstner,
1978; Ruder, 1973; Chaparos, 1981].

We simultaneously developed a taxonomy of C constructs, a
systematic enumeration and classification of aspects of the language
[AT&T, 1985; Kernighan & Ritchie, 1978; Harbison & Steele, 1984].
This was intended to be a companion checklist for ensuring
completeness in the representation of C source text. We subsequently
reworked our taxonomy slightly to make it maximally consistent
with the presentation in [Harbison & Steele, 1984]. We chose to
work with C for a number of reasons: its commercial importance,
its illegibility, and its unreadability.

Next, we collected and systematized typical mappings from C
constructs to typographic constructs, examples abstracted from real C
programs prepared by typical experienced C programmers.

Because these examples often embody real design insights from
non-designers, we call them "folk designs".

Then, we developed a systematic approach to the design of
mappings from C constructs to typographic constructs, an approach that
forms the basis for detailed visual research into effective
presentations of C source code. We shall describe the approach in detail in
this report and illustrate it via an application to a concrete
example.

To test our systematic approach to the design of program
presentation, we constructed SEE, a visual C compiler, a program that maps
an arbitrary C program into an effective typeset representation of
that program. A description of the implementation appears in
Volume 6 of the report. We have produced numerous examples using
this automated tool, which has in turn enabled us to improve the
graphic design of program appearance. Some of the examples are
collected in Volume 3 of the report. The final specifications were
then embodied in a graphic design manual for the appearance of C
programs. This manual is Volume 2 of the report.

Finally, we shifted our viewpoint away from the details of code

Section 1.5 The Final Report and the Deliverables


Volume 1: Theory, Results, and Conclusions

This volume presents the theory, summarizes the results, and
suggests the conclusions that may be derived from the overall work.

Volume 2: A Graphic Design Manual for C

Volume 2 summarizes our systematic approach to the design of
program presentation from a graphic design perspective. It is
therefore a graphic design manual for the appearance of C
programs and C program books.

Volume  3:  Graphic  Design  Variations  of  C  Program 
Appearance 

Volume  3  presents  selected  examples  of  C  program  visualization 
that  can  be  realized  with  the  SEE  program  visualizer  and  that 
present  significant  variations  of  the  recommended  conventions. 

Volume  4:  Traditional  Listings  and  Documentation  for 
the  Eliza  Program 

Volume 4 presents the listings and documentation for a program
in its typical form of appearance. The program shown is Joseph
Weizenbaum's famous Eliza program [Weizenbaum, 1966]. Henry
Spencer of the Department of Zoology of the University of
Toronto has implemented this new version.

Volume  5:  A  Prototype  Program  Book  of  the  Eliza 
Program 

Volume  5  illustrates  the  concept  of  the  program  as  a  publication. 

A  mock-up  of  a  prototype  program  book  of  the  Eliza  program 
appears.  Included  in  the  mock-up  is  the  primary  source  text,  the 
code  and  comments,  which  were  automatically  typeset  by  the  SEE 
program  visualizer. 

Volume  6:  A  Program  Visualization  Implementation 

Volume 6 describes the implementations of SEE and of the UNIX
TROFF [Kernighan, 1982] typesetting macro packages used to
format program visualization text and programs.

Deliverables

These six volumes comprise the Final Report and the Graphic
Design Manual to be delivered to DARPA as per the Contract Data
Requirements List of Contract Number F30602-82-C-0173. In
particular, referring back to the Statement of Work, Section 4.2, the
"typeset examples" of Section 4.2.1 are included in our Volumes 1
through 3 and 5; the "program" of Section 4.2.2 is described in our
Volumes 1 and 6; the "Graphic Design Manual" of Section 4.2.3 is
our Volume 2; and the "report" and "image sequences" of Section
4.2.4 are included in our Volumes 2 through 5.

A Program Visualization video tape is being prepared which
illustrates the objectives, goals, method, results, and significance of our
work in a more informal manner. A magnetic tape containing the
implemented program is available where appropriate.

Finally, we note that the typeset examples in Volumes 1, 3, and 5
were prepared "almost totally automatically" by SEE. Electronic or
manual fix-ups were used to fix three bad line breaks in Volume 5,
to add some white space in two recurring kinds of locations in
Volumes 1 and 5, to fix roughly six bad page breaks in Volumes 1 and
5, to add Letratone, an occasional bracket, and the pointing fingers
that appear in Volumes 1, 3, and 5, and to add the footnotes shown
in Figure 50 of Volume 3. For comparison purposes, fingers have
only been used in the example in Volume 1, the first five figures in
Volume 3, and one file of Eliza in Volume 5.

Chapter 2

An Example of the Design of Program Appearance


Our example consists of a slightly updated version of a desk
calculator program that appears in a standard book on C [Kernighan
& Ritchie, 1978].


The program is shown as Figure 1 on pages 16 through 18 as it is
output on a typical dot matrix line printer, a device similar to that
used by tens of thousands of programmers of microcomputers and
minicomputers. Even the lightness of the type, caused by a worn
out ribbon, reflects an unfortunate aspect of the way most line
printers are used. This of course impedes legibility and readability.

The program is shown again as Figure 2 on pages 20 through 22.
This time it has been output on a modern laser printer. It appears
in exactly the same format as does Figure 1, and again uses fixed
width type in a single font at a single point size. Legibility and
readability are somewhat enhanced.

Figure 3 on pages 24 through 27 shows the output from the current
version of the SEE processor to the same laser printer with an
appropriate set of fonts. The C program was not modified at all for
input to SEE; exactly the same text was input to the listing program
that produced Figures 1 and 2. The SEE output was adjusted by hand
only by introducing some white space to improve the way in
which the program is paginated, since white space introduction and
pagination are not yet handled automatically by SEE. The subtitles
below refer to categories of program visualization improvements
discussed later in this volume; the numbers in the margin of Figure
3 refer to various items in the following commentary:

The  Presentation  of  Program  Metadata 

1. The program is presented on a standard 8½ x 11 inch page that
is separated into four regions: a header, a footnote area, a code
column, and a marginalia comment column.

2.  The  header  contains  key  document  metadata  describing  the 
context  of  the  source  code  that  appears  on  the  page,  including  the 
location  of  the  file  from  which  the  listing  was  made  and  the  page 
number  within  the  listing. 
The Spatial Composition of Comments

3. Comments that are external to function definitions are
displayed in a small-sized serif font inside an outline box. There is
ample margin allowance around the text to ensure optimum
legibility and readability.

4.  Comments  that  are  internal  to  function  definitions  are  displayed 
in  a  small-sized  serif  font  appropriately  indented  and  marked  by  a 
left  vertical  bracket. 

5.  Comments  that  are  located  on  the  same  lines  as  source  code, 
which  we  call  marginalia  comments,  are  displayed  in  a  small-sized 
serif  font  in  the  marginalia  column.  These  items  are  intended  to  be 
short  single  line  phrases. 
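
The three comment placements in items 3 through 5 amount to a small classification rule. The following is a minimal sketch of that rule in C, not SEE's actual implementation. The names comment_kind and classify_comment are hypothetical, and brace depth is assumed to be an adequate stand-in for "inside a function definition".

```c
/* Hypothetical names; SEE's real data structures are not described here. */
enum comment_kind {
    COMMENT_EXTERNAL,   /* outside any function: outline box (item 3) */
    COMMENT_INTERNAL,   /* inside a function, on its own line: bracket (item 4) */
    COMMENT_MARGINALIA  /* shares a line with code: marginalia column (item 5) */
};

/*
 * brace_depth:  number of unclosed '{' before the comment (assumed proxy
 *               for being inside a function body).
 * code_on_line: nonzero if source code precedes the comment on its line.
 */
enum comment_kind classify_comment(int brace_depth, int code_on_line)
{
    if (code_on_line)
        return COMMENT_MARGINALIA;  /* applies inside and outside functions */
    if (brace_depth > 0)
        return COMMENT_INTERNAL;
    return COMMENT_EXTERNAL;
}
```

Checking for code on the same line first means that a comment beside a declaration or a function header is treated as marginalia wherever it occurs, which matches item 5.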

The  Typography  of  Program  Punctuation 

6.  In  this  example  the  appears  in  10  point  regular  Helvetica 
type,  and  thus  uses  the  same  typographic  parameters  as  does 
much  of  the  program  code.  The  on  the  other  hand,  has  been 
set  in  bold  type,  and  the  has  been  enlarged  to  14  point.  These 
distinctions highlight the difficulties in achieving legible punctuation
with currently available typefaces. The bold is often slightly
too  heavy;  the  regular  weight  is  sometimes  too  easily  overlooked 
if  the  original  has  been  poorly  displayed  with  badly  adjusted 
equipment  or  if  it  has  been  degraded  through  photocopying.  In 
addition,  idiosyncratic  size  changes  for  particular  characters  in 
particular  fonts  are  often  desirable. 

7. Symbols such as the "++" and the "--" have been kerned, that
is, the letter spacing of individual characters overlaps to make
them more legible and readable.

8. Symbol substitutions have not been introduced for symbols that
clearly need improved appearance, e.g., the ">=" and "==".
Whether or not these substitutions are invoked should be
determined by a flag under control of the user. Legibility criteria
would suggest innovation; however, reader familiarity and direct
semantic reference to two input keyboard strokes would suggest
the conventional alternative that we currently recommend. For
an example of this, see Volume 3, Figure 20, page 28.

Typographic Encodings of Token Attributes

9. Most tokens are shown in a regular sans-serif font; reserved
words are shown in italic sans-serif type. Bold sans-serif is used
to highlight global (extern) variables (see 22).

10.  String  constants  are  shown  in  a  small-sized  serif  font. 
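
Items 9 and 10 describe a mapping from token attributes to type styles. A minimal sketch of such a mapping in C follows. The token classes, the type_style record, and style_for are hypothetical names chosen for illustration and are not taken from SEE.

```c
/* Hypothetical token classes and style record; not SEE's actual types. */
enum token_kind { TOK_ORDINARY, TOK_RESERVED, TOK_GLOBAL, TOK_STRING };

struct type_style {
    const char *family;   /* "sans-serif" or "serif" */
    const char *variant;  /* "regular", "italic", or "bold" */
    int small;            /* nonzero for the small-sized font */
};

struct type_style style_for(enum token_kind kind)
{
    /* Item 9: most tokens are regular sans-serif. */
    struct type_style s = { "sans-serif", "regular", 0 };

    switch (kind) {
    case TOK_RESERVED:              /* item 9: reserved words in italic */
        s.variant = "italic";
        break;
    case TOK_GLOBAL:                /* item 9: global (extern) variables in bold */
        s.variant = "bold";
        break;
    case TOK_STRING:                /* item 10: small-sized serif font */
        s.family = "serif";
        s.small = 1;
        break;
    case TOK_ORDINARY:
        break;
    }
    return s;
}
```

Keeping the mapping in one function is one way to let a designer tune the conventions without touching the rest of a formatter.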

The Presentation of Preprocessor Commands

11. The "#" signifying a preprocessor command is exdented to
enhance  its  distinguishability  from  ordinary  C  source  text. 

12. Macros and their values are presented at appropriate horizontal
tab positions.

The Presentation of Declarations

13. Identifiers being declared are aligned to a single implied vertical
line located at an appropriate horizontal tab position.
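
One simple way to choose the tab position in item 13 is to place the identifiers just past the widest type specifier in the group. The sketch below assumes this rule and measures width in characters, which is only an approximation for proportionally spaced type. The decl record and name_column are hypothetical names, not part of SEE.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one declaration: type specifier and declarator. */
struct decl {
    const char *type;
    const char *name;
};

/*
 * Returns the column (counted in characters from the start of the type
 * specifier) at which every declared identifier should begin: the width
 * of the longest type specifier plus a fixed gap.
 */
size_t name_column(const struct decl *decls, size_t count, size_t gap)
{
    size_t widest = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        size_t width = strlen(decls[i].type);
        if (width > widest)
            widest = width;
    }
    return widest + gap;
}
```

For the declarations int type, char s[MAXOP], and double op2 with a gap of two characters, the widest specifier is "double", so the identifiers would start at column 8.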

The Visual Parsing of Expressions

14.  Parentheses  and  brackets  are  emboldened  to  call  attention  to 
grouped  items.  Nested  parentheses  are  varied  in  size  to  aid  the 
parsing  of  the  expression. 

15.  The  word  spacing  between  operators  within  an  expression  is 
varied  to  aid  the  visual  parsing  of  the  expression.  Operands  are 
displayed  closer  to  operators  of  high  precedence  than  to  operators 
of  low  precedence. 
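
Items 14 and 15 can both be driven by simple numeric measures: the nesting level of each parenthesis and the binding strength of each operator. The sketch below is an illustration under stated assumptions, not SEE's algorithm. The function names, the abstract spacing units, and the small operator table (a subset of C's binary operators) are all hypothetical, and parentheses inside string or character constants are not handled.

```c
#include <string.h>

#define TIGHTEST 5  /* binding level of the multiplicative operators below */

/* Binding strength for a subset of C binary operators; larger binds tighter.
   Returns -1 for an operator that is not in this small table. */
int precedence(const char *op)
{
    if (!strcmp(op, "*") || !strcmp(op, "/") || !strcmp(op, "%")) return 5;
    if (!strcmp(op, "+") || !strcmp(op, "-")) return 4;
    if (!strcmp(op, "<") || !strcmp(op, ">") ||
        !strcmp(op, "<=") || !strcmp(op, ">=")) return 3;
    if (!strcmp(op, "==") || !strcmp(op, "!=")) return 2;
    if (!strcmp(op, "&&")) return 1;
    if (!strcmp(op, "||")) return 0;
    return -1;
}

/* Item 15: space on each side of an operator, in abstract units.
   High precedence gives a small space, low precedence a large one. */
int operator_space(const char *op)
{
    int p = precedence(op);
    return (p < 0) ? -1 : TIGHTEST - p + 1;
}

/* Item 14: records the nesting level (1 = outermost) of every '(' and ')'
   in expr, and 0 for all other characters. level must have room for
   strlen(expr) entries. Returns the maximum depth, or -1 if unbalanced. */
int paren_levels(const char *expr, int *level)
{
    int depth = 0, deepest = 0, i;

    for (i = 0; expr[i] != '\0'; i++) {
        level[i] = 0;
        if (expr[i] == '(') {
            level[i] = ++depth;
            if (depth > deepest)
                deepest = depth;
        } else if (expr[i] == ')') {
            if (depth == 0)
                return -1;
            level[i] = depth--;
        }
    }
    return (depth == 0) ? deepest : -1;
}
```

A formatter could then pick a larger point size for parentheses with a lower level number, and a wider word space for operators with a larger operator_space value.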

The Visual Parsing of Statements

16.  Systematic  indentation  and  placement  of  key  words  is 
employed. 

17.  Since  curly  braces  are  redundant  with  systematic  indentation, 
they  are  removed  in  this  example.  Whether  this  happens  or  not  is 
determined  by  a  flag  under  control  of  the  user. 


18. "Unusual" control flow is marked with pointing fingers located
in the margin.

Figure 1: A listing of a simple desk calculator program produced on
a dot matrix line printer

(See next 3 pages.)

Aug 30 11:49 1985  calc.c  Page 1

/*
 * This reverse Polish desk calculator adds, subtracts, multiplies and
 * divides floating point numbers.  It also allows the commands '=' to
 * print the value of the top of the stack and 'c' to clear the stack.
 */

#include <stdio.h>
#define MAXOP   20              /* max size of operand, operator */
#define NUMBER  '0'             /* signal that number found */
#define TOOBIG  '9'             /* signal that string is too big */

/*
 *      Control Module
 */

calc()
{
        int     type;           /* operation type */
        char    s[MAXOP];       /* buffer containing operator */
        double  op2,            /* temporary variable */
                atof(),         /* converts strings to floating point */
                pop(),          /* pops the stack */
                push();         /* pushes the stack */

        /* loop while we can get an operation string and type */
        while ((type = getop(s, MAXOP)) != EOF)
                switch (type) {
                case NUMBER:
                        push(atof(s));
                        break;
                case '+':
                        push(pop() + pop());
                        break;
                case '*':
                        push(pop() * pop());
                        break;
                case '-':
                        op2 = pop();
                        push(pop() - op2);
                        break;
                case '/':
                        op2 = pop();
                        if (op2 != 0.0)
                                push(pop() / op2);
                        else
                                printf("zero divisor popped\n");
                        break;
                case '=':
                        printf("\t%f\n", push(pop()));
                        break;
                case 'c':
                        clear();
                        break;
                case TOOBIG:
                        printf("%.20s ... is too long\n", s);
                        break;
                default:
                        printf("unknown command %c\n", type);
                        break;
                }
}

/*
 *      Stack Management Module
 */

#define MAXVAL  100             /* maximum depth of val stack */

int     sp = 0;                 /* stack pointer */
double  val[MAXVAL];            /* value stack */

double push(f)                  /* push f onto value stack */
double f;
{
        if (sp < MAXVAL)
                return (val[sp++] = f);
        else {
                printf("error: stack full\n");
                clear();
                return(0);
        }
}

double pop()                    /* pop top value from stack */
{
        if (sp > 0)
                return(val[--sp]);
        else {
                printf("error: stack empty\n");
                clear();
                return(0);
        }
}

clear()                         /* clear stack */
{
        sp = 0;
}

/*
 *      Input Module
 */

getop(s, lim)                   /* get next operator or operand */
char s[];                       /* operator buffer */
int lim;                        /* size of input buffer */
{
        int i, c;

        /* skip blanks, tabs and newlines */
        while ((c = getch()) == ' ' || c == '\t' || c == '\n')
                ;

        /* return if not a number */
        if (c != '.' && (c < '0' || c > '9'))
                return(c);
        s[0] = c;

        /* get rest of number */
        for (i = 1; (c = getchar()) >= '0' && c <= '9'; i++)
                if (i < lim)
                        s[i] = c;

        /* collect fraction */
        if (c == '.') {
                if (i < lim)
                        s[i] = c;
                for (i++; (c = getchar()) >= '0' && c <= '9'; i++)
                        if (i < lim)
                                s[i] = c;
        }

        /* number is ok */
        if (i < lim) {
                ungetch(c);
                s[i] = '\0';
                return(NUMBER);
        } else {                /* it's too big; skip rest of line */
                while (c != '\n' && c != EOF)
                        c = getchar();
                s[lim - 1] = '\0';
                return(TOOBIG);
        }
}

#define BUFSIZE 100

char    buf[BUFSIZE];           /* buffer for ungetch */
int     bufp = 0;               /* next free position in buf */

getch()                         /* get a (possibly pushed back) character */
{
        return((bufp > 0) ? buf[--bufp] : getchar());
}

ungetch(c)                      /* push character back on input */
int c;
{
        if (bufp > BUFSIZE)
                printf("ungetch: too many characters\n");
        else
                buf[bufp++] = c;
}

Figure 2: A listing of the desk calculator program produced on a
laser printer


(See  next  3  pages.) 
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This  (••verse  Polish  dash  calculator  adds,  subtracts,  multiplies  and 
divides  -floating  point  numbers.  It  also  allows  the  commands  to 
print  the  value  o-f  the  top  of  the  stack  and  *c’  to  clear  the  stack. 


•include  (stdio.h) 
•define  MAXOP  20 
•define  NUMBER  ’O’ 
•define  T00BIQ  •?’ 


/•  man  size  of  operand,  operator  •/ 
/•  signal  that  number  found  •/ 

/•  signal  that  string  is  too  big  •/ 


Control  Moduli 


ca 1 c ( > 

< 


mt  types 
char  s  C MAX OP 2 s 
double  op2. 

atof ( ) . 
pop ( ) . 
push ( ) : 


/•  operation  type  •/ 

/•  buffer  containing  operator  •/ 

/•  temporary  variable  •/ 

                        /* converts strings to floating point */
                        /* pops the stack */
                        /* pushes the stack */

    /* loop while we can get an operation string and type */

    while ((type = getop(s, MAXOP)) != EOF)
        switch (type) {
        case NUMBER:
            push(atof(s));
            break;
        case '+':
            push(pop() + pop());
            break;
        case '*':
            push(pop() * pop());
            break;
        case '-':
            op2 = pop();
            push(pop() - op2);
            break;
        case '/':
            op2 = pop();
            if (op2 != 0.0)
                push(pop() / op2);
            else
                printf("zero divisor popped\n");
            break;
        case '=':
            printf("\t%f\n", push(pop()));
            break;
        case 'c':
            clear();
            break;
        case TOOBIG:
            printf("%.20s ... is too long\n", s);
            break;
        default:
            printf("unknown command %c\n", type);
            break;
        }
}

/* Stack Management Module */

#define MAXVAL 100      /* maximum depth of val stack */

int sp = 0;             /* stack pointer */
double val[MAXVAL];     /* value stack */

double push(f)          /* push f onto value stack */
double f;
{
    if (sp < MAXVAL)
        return (val[sp++] = f);
    else {
        printf("error: stack full\n");
        clear();
        return (0);
    }
}

double pop()            /* pop top value from stack */
{
    if (sp > 0)
        return (val[--sp]);
    else {
        printf("error: stack empty\n");
        clear();
        return (0);
    }
}

clear()                 /* clear stack */
{
    sp = 0;
}

/* Input Module */

getop(s, lim)           /* get next operator or operand */
char s[];               /* operator buffer */
int lim;                /* size of input buffer */
{
    int i, c;

    /* skip blanks, tabs and newlines */
    while ((c = getch()) == ' ' || c == '\t' || c == '\n')
        ;

    /* return if not a number */
    if (c != '.' && (c < '0' || c > '9'))
        return (c);
    s[0] = c;

    /* get rest of number */
    for (i = 1; (c = getchar()) >= '0' && c <= '9'; i++)
        if (i < lim)
            s[i] = c;

    /* collect fraction */
    if (c == '.') {
        if (i < lim)
            s[i] = c;
        for (i++; (c = getchar()) >= '0' && c <= '9'; i++)
            if (i < lim)
                s[i] = c;
    }

    if (i < lim) {      /* number is ok */
        ungetch(c);
        s[i] = '\0';
        return (NUMBER);
    } else {            /* it's too big; skip rest of line */
        while (c != '\n' && c != EOF)
            c = getchar();
        s[lim - 1] = '\0';
        return (TOOBIG);
    }
}

#define BUFSIZE 100

char buf[BUFSIZE];      /* buffer for ungetch */
int bufp = 0;           /* next free position in buf */

getch()                 /* get a (possibly pushed back) character */
{
    return ((bufp > 0) ? buf[--bufp] : getchar());
}

ungetch(c)              /* push character back on input */
int c;
{
    if (bufp > BUFSIZE)
        printf("ungetch: too many characters\n");
    else
        buf[bufp++] = c;
}


Figure 3: The desk calculator program produced on a laser printer
using the SEE program visualizer
Chapter 1  calc.c


This reverse Polish desk calculator adds, subtracts, multiplies and
divides floating point numbers. It also allows the commands '=' to
print the value of the top of the stack and 'c' to clear the stack.


                                # include  <stdio.h>
Max size of operand, operator   # define   MAXOP    20
Signal that number found        # define   NUMBER   '0'
Signal that string is too big   # define   TOOBIG   '9'

Control Module

                                calc()

Operation type                  int     type;
Buffer containing operator      char    s[MAXOP];
Temporary variable              double  op2,
Converts strings to floating            atof(),
point
Pops the stack                          pop(),
Pushes the stack                        push();

                                Loop while we can get an operation string and type

                                while ((type = getop(s, MAXOP)) != EOF)
                                    switch (type)
                                    case NUMBER:
                                        push(atof(s));
                                        break;
                                    case '+':
                                        push(pop() + pop());
                                        break;
                                    case '*':
                                        push(pop() * pop());
                                        break;
                                    case '-':
                                        op2 = pop();
                                        push(pop() - op2);
                                        break;
                                    case '/':
                                        op2 = pop();
                                        if (op2 != 0.0)
                                            push(pop() / op2);
                                        else
                                            printf("zero divisor popped\n");
                                        break;
                                    case '=':
                                        printf("\t%f\n", push(pop()));
                                        break;
                                    case 'c':
                                        clear();
                                        break;
                                    case TOOBIG:
                                        printf("%.20s ... is too long\n", s);
                                        break;
                                    default:
                                        printf("unknown command %c\n", type);
                                        break;

Stack Management Module

Maximum depth of val stack      # define   MAXVAL   100
Stack pointer                   int     sp = 0;
Value stack                     double  val[MAXVAL];

Push f onto value stack         double
                                push(f)
                                double  f;

                                if (sp < MAXVAL)
                                    return (val[sp++] = f);
                                else
                                    printf("error: stack full\n");
                                    clear();
                                    return (0);

Pop top value from stack        double
                                pop()

                                if (sp > 0)
                                    return (val[--sp]);
                                else
                                    printf("error: stack empty\n");
                                    clear();
                                    return (0);

Clear stack                     clear()

                                sp = 0;

Input Module

Get next operator or operand    getop(s, lim)
Operator buffer                 char    s[];
Size of input buffer            int     lim;

                                int     i, c;

                                Skip blanks, tabs and newlines

                                while ((c = getch()) == ' ' || c == '\t' || c == '\n')
                                    ;

                                Return if not a number

                                if (c != '.' && (c < '0' || c > '9'))
                                    return (c);
                                s[0] = c;

                                Get rest of number

                                for (i = 1; (c = getchar()) >= '0' && c <= '9'; i++)
                                    if (i < lim)
                                        s[i] = c;

Collect fraction                if (c == '.')
                                    if (i < lim)
                                        s[i] = c;
                                    for (i++; (c = getchar()) >= '0' && c <= '9'; i++)
                                        if (i < lim)
                                            s[i] = c;

Number is ok                    if (i < lim)
                                    ungetch(c);
                                    s[i] = '\0';
                                    return (NUMBER);
It's too big; skip rest of      else
line                                while (c != '\n' && c != EOF)
                                        c = getchar();
                                    s[lim - 1] = '\0';
                                    return (TOOBIG);

                                # define   BUFSIZE  100
Buffer for ungetch              char    buf[BUFSIZE];
Next free position in buf       int     bufp = 0;

Get a (possibly pushed back)    getch()
character
                                return ((bufp > 0) ? buf[--bufp] : getchar());

Push character back on input    ungetch(c)
                                int     c;

                                if (bufp > BUFSIZE)
                                    printf("ungetch: too many characters\n");
                                else
                                    buf[bufp++] = c;


Chapter 3  C Program Books


A program book would typically be composed of primary, secondary,
and tertiary texts structured into five parts (see Figure 4):

— The book begins with secondary text known as the “front matter”.
This may include a cover page, title page, copyright page,
abstract, authors and personalities page, and program history
page.

— Chapter 1 is the tertiary text that comprises the user
documentation: the command summary and manual page, the tutorial
guide, and the reference manual.

— Chapters 2 through n+1 constitute the primary text, the program
code and comments. Each of the n files in the program appears in
a separate chapter. Each program page has various metadata and
commentaries included in its header and footer.

— Chapter n+2 contains more secondary text, various indices and
overviews. These may include program metrics, program signatures
and condensations, a cross reference index, a key word in context
index, a call hierarchy, and various other diagrams.

— Chapter n+3 includes the remaining part of the tertiary text,
the programmer documentation: the installation guide and
README file, the “make” file, and the maintenance guide.


Whereas any listing or representation of the program or of a piece
of it will contain primary text, some or most of these secondary
texts can and will be omitted in a “quick and dirty” look at a
program that is likely to be changed almost immediately, as is the
case when one is creating or debugging code.

The tertiary text is the source of still additional information about
the program, how it was built, and how it is to be used. Even more
so than in the case of secondary text, the investment in the
production of tertiary text is most easily justified if the program
has considerable readership and longevity.


Figure 4: The structure of a program book

[Diagram. Labels: Program Book; The Program; Support Documentation;
Commentary; Front Matter; Tertiary Text; Overviews; Indices; User
Documents; Programmer Documents; Primary Text; Source Code;
Comments.]


Section 3.1  Secondary Text: Front Matter


Cover Page

A program published in book form may need a cover page identifying
the book and depicting it with an attractive illustration.

Title Page

The program's title page presents the most important metadata,
such as the program's title, author, company and address of the
author, version, date, publishing source, and level of
confidentiality.

Colophon

The program's colophon presents production information, details
about the typesetting, printing, and distribution of the document.

Abstract

An abstract of the program summarizes what it does, how it
accomplishes it, and why it does it.

Program History

A design history presents the history of the system from conception
to implementation through recent modification. As program
genealogy, it may also be invaluable in understanding apparently
nonsensical constructs and bizarre artifacts.

Authors and Personalities

This page lists the authors and other important personalities (e.g.,
augmenters and maintainers) associated with the program, gives
their postal and network addresses, their phone numbers, and
potentially also their photographs [Pike, 1985].

Table of Contents

The table of contents enumerates the major parts of the program.
In the case of a program operating under the UNIX operating
system, for example, it would probably list the directories and
files and possibly also the defined functions.

Section 3.2  Tertiary Text: User Documentation


Command Summary and Manual Page

A summary of commands is essential for every user of any system.
In the UNIX world, this command summary is often included in the
manual page, or “man page”. By convention, one such page is
written to correspond to each UNIX utility or command installed
on the system.

Tutorial  Guide 

A  tutorial  guide  presents  a  step-by-step  introduction  to  the  usage 
of  the  major  features  of  the  system. 

Reference  Manual 

A  reference  manual  is  a  comprehensive  information  source  on  all 
features  of  the  system. 

Section 3.3  Primary Text: The Program


The primary text is the program itself. Its appearance is the topic
of the next chapter of this report. Each file of the program is
represented by a number of program pages. These pages each
include:

Program Code

The “program books” of today, known as listings, often contain
only code.

Program Comments

Comments appear in various forms and locations on the page, as
discussed in Section 4.2 of this report.

Section 3.4  Secondary Text: Metadata and Commentaries


Also located on the program pages are two kinds of secondary
text, selected metadata and program cross-reference information.

Program Page Headers

Program page headers include selected metadata under the control
of the user requesting the listing.

Program  Page  Footnotes 

Program  page  footnotes  should  include  cross-references  to  the 
definitions  of  identifiers  declared  “externally”  to  that  particular 

Section 3.5  Tertiary Text: Indices and Overviews


Program Metrics

A list of metrics [Gilb, 1976; Perlis, Sayward & Shaw, 1981] would
include numerical tables and charts encapsulating significant
properties or qualities of the program. Software engineers and
human factors specialists must determine their proper content.

Program Signatures and Condensations

Program signatures and program condensations are visual
representations of the code that compress the text into small
diagrams or symbols. These allow a viewer to quickly scan many
pages of a program.

Cross Reference Index

Cross reference listings detail where every identifier is declared
and all instances of its use.

Key Words in Context Index

Key word in context listings show all program phrases
alphabetically in the context of their surrounding text.
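
As a rough illustration of the key word in context idea, the sketch below collects every identifier-like word from a few lines of program text and sorts the entries alphabetically, keeping a pointer to the full line as context. The names `kwic_entry`, `kwic_build`, and `kwic_print` are hypothetical and are not taken from the SEE system; the report does not say how its index was actually produced.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXENTRIES 64   /* illustrative limit, not from the report */

/* One index entry: a key word and the line in which it occurs. */
struct kwic_entry {
    const char *line;   /* the full context line */
    int offset;         /* where the key word starts within line */
    int length;         /* length of the key word */
};

/* Order entries alphabetically by key word; shorter words sort first on ties. */
static int compare_entries(const void *a, const void *b)
{
    const struct kwic_entry *x = a, *y = b;
    int n = x->length < y->length ? x->length : y->length;
    int r = strncmp(x->line + x->offset, y->line + y->offset, (size_t)n);
    return r != 0 ? r : x->length - y->length;
}

/* Fill entries[] with one entry per identifier-like word; return the count. */
int kwic_build(const char *lines[], int nlines,
               struct kwic_entry entries[], int max)
{
    int count = 0;
    for (int i = 0; i < nlines; i++) {
        const char *p = lines[i];
        while (*p != '\0') {
            if (isalpha((unsigned char)*p) || *p == '_') {
                const char *start = p;
                while (isalnum((unsigned char)*p) || *p == '_')
                    p++;
                if (count < max) {
                    entries[count].line = lines[i];
                    entries[count].offset = (int)(start - lines[i]);
                    entries[count].length = (int)(p - start);
                    count++;
                }
            } else {
                p++;
            }
        }
    }
    qsort(entries, (size_t)count, sizeof entries[0], compare_entries);
    return count;
}

/* Print each key word in a fixed-width column followed by its context line. */
void kwic_print(const struct kwic_entry entries[], int count)
{
    for (int i = 0; i < count; i++)
        printf("%-12.*s %s\n", entries[i].length,
               entries[i].line + entries[i].offset, entries[i].line);
}
```

Calling `kwic_build` on the two lines `push(pop() + pop());` and `op2 = pop();` yields five entries, beginning with `op2` and ending with `push`. A production index would presumably also record file and line numbers.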

Call  Hierarchy 

A  call  hierarchy  diagram  shows  the  nesting  of  function  calls. 

Other  Diagrams 

Various other diagrammatic representations [Martin & McClure,
1985] that portray the structure of the program should also be
included.

Section 3.6  Tertiary Text: Programmer Documentation


The Installation Guide and README File

An installation guide contains instructions on how to install a
system. In a UNIX distribution, it is typically part of a “README”
file. In the UNIX world, a README file is by convention included on
any tape containing a software distribution. This file is the first
read by the programmer upon receipt of the system, and thus should
be a guidebook to what is in the distribution.

The Make File

In the UNIX world, the “make” file is used by the UNIX “make”
program to facilitate system recompilation and regeneration.

Maintenance  Guide 

The  maintenance  guide  contains  instructions  on  how  to  maintain 
the  system.  It  is  thus  an  additional  commentary  on  the  program. 

Chapter 4  Graphic Design of C Source Code and Comments

Our goal in the research was to apply the full palette of graphic
design techniques to reveal and express the meaning of C programs.
We worked on ten specific problems and explored various methods
for displaying the following:

The  Presentation  of  Program  Metadata 

Enhancing  the  display  of  a  program  in  relationship  to  the  relevant 
data  describing  the  context  in  which  the  program  was  created,  is 
maintained,  and  will  be  used. 

The  Spatial  Composition  of  Comments 

Presenting  program  comments  clearly  in  relationship  to  program 
code. 

The  Typography  of  Program  Punctuation 

Enhancing the visual effectiveness of C punctuation marks
(separators, containment symbols, and operators).

Typographic  Encodings  of  Token  Attributes 

Mapping  C  tokens  (identifiers,  reserved  words,  and  constants)  into 
effective  typographic  representations. 

The  Presentation  of  Preprocessor  Commands 

Presenting  C  preprocessor  commands  in  a  more  effective  manner. 

The  Presentation  of  Declarations 

Enhancing  the  structure  of  the  declarations  of  C  identifiers. 

The  Visual  Parsing  of  Expressions 

Using  typographic  attributes  to  enhance  the  ability  of  a  human 
reader  to  identify  and  understand  complex  program  expressions. 

Section 4.2  The Spatial Composition of Comments


Traditional methods of structuring programs pay little attention to
developing and enhancing the content and method of presenting
comments in relationship to code. Comments, if added at all, are
often an afterthought, an unpleasant reminder that management
is concerned about issues of program readability and
maintainability. Nor is the process of creating comments and
integrating them with code facilitated by the interactive text
editors and program development environments commonly available.

In  our  research  we  were  unable  to  deal  with  the  management 
issues  implied  by  the  legislation  of  adequate  comments  nor  with 
the  literary  and  stylistic  concerns  of  making  comments  both 
appropriate  and  meaningful.  Instead,  we  have  been  concerned 
with  presenting  comments  for  maximum  effect,  both  in  isolation 
and  in  relationship  to  code. 

To  distinguish  and  highlight  comments,  we  have  distinguished 
external  comments  (those  outside  a  function  definition),  internal 
comments  (those  within  a  function  definition,  which  appear  on 
their  own  line  in  the  input  text),  and  marginalia  (those  within  a 
function  definition,  but  which  do  not  appear  on  their  own  line). 
The  typographic  variations  that  we  have  considered  or  explored 
include: 

— Comments integrated with code in a one column format; comments
strictly separated from code in a two column format; and various
mixtures of one column and two column formats.

—  Assuming  a  two  column  format,  code  on  the  left  with  comments 
on  the  right,  or  code  on  the  right  with  comments  on  the  left. 

—  Assuming  a  two  column  format,  variations  in  the  width  of  the 
code  in  relation  to  the  width  of  the  comments,  for  example,  2:1 
or  3:1. 

—  Use  of  the  same  font  for  code  and  comments,  use  of  variations 
of  one  font  (roman,  bold,  italic),  and  use  of  three  different  fonts 
(for  example,  a  square-serif  font  such  as  American  Typewriter, 
a  serif  font  such  as  Times  Roman,  and  a  sans-serif  font  such  as 
Helvetica). 

— Variations in the point size and leading of the comments relative
to the point size of the code.

— Use of various diagrammatic notations, such as leader lines,
arrows, or connecting braces, to indicate connectivity between
code and comments.

— Use of various gray scale tints overlaid on regions containing
various kinds of comments.

— Use of various kinds of rules and boxes to delimit regions
containing various kinds of comments.

Section 4.3  The Typography of Punctuation

The punctuation marks of computer programs consist of separators,
containment symbols such as “(” and “)”, and operators. The
legibility of punctuation marks in program text is a critical
component affecting the comprehensibility of a program, much more
so than the legibility of English language punctuation affects the
comprehensibility of a passage in English.

We have therefore considered or experimented with various methods
of enhancing the legibility of program punctuation, including:

— Emboldening and/or enlarging punctuation marks.

— Kerning compound (multicharacter) operators.

— Substituting symbols that are more legible.

It is obvious that, for C code, the ratio of punctuation marks to
alphabetics and numerics is quite different than for prose text.
Unfortunately, no typeface currently exists that has been optimized
for use in representing computer programs.

Section 4.4  Typographic Encodings of Token Attributes


Current attempts at program visualization often employ crude
mechanisms for distinguishing typographically one kind of token
from another. Reserved words are often shown in bold face; manifest
constants are often named using capital letters only. These
attempts, typical of many prettyprinting programs, represent but
a small fraction of the wealth of the purely typographic
possibilities for enhancing the legibility and readability of
programs. The optimum encoding is a complex synthesis of the
reader's needs for clarity when scanning the text with a variety of
search motives and when examining the text slowly and in detail.
Unfortunately, extensive data on programmers' reading patterns are
not yet available in the literature of computer science or visible
language.


We have experimented with mappings from C token attributes to
typographic attributes. We first organized C token attributes
according to a token hierarchy. This procedure allowed us to
distinguish typographically the following classes:


Comments (see Section 4.2)
    External comments
    Internal comments
    Marginalia comments
Punctuation tokens (see Section 4.3)
    Separator symbols
    Containment symbols
    Operators
        Simple operators
        Compound operators
Other tokens
    Reserved words
        Preprocessor reserved words (see Section 4.5)
        Declarative reserved words
        Control reserved words
        Control flow altering reserved words
    Variables
        Local variables
        Global variables
        Static variables
    Preprocessor macro names
        Manifest constants
        Other macros
    Other identifiers
        Function names in declarations
        Function names in use
        Typedef names
        Type tags
            Structure and union tags
            Structure and union member names
            Enumeration tags
            Enumeration constants
        Statement labels
    Constants
        Integer, floating point, and character constants
        String constants

We then considered or experimented with the visible language
appearance of these token attributes to achieve optimum legibility
and readability. Attributes used in the encodings included the
following:

—  Choice  of  typeface,  for  example,  Helvetica,  Times  Roman,  or 
American  Typewriter. 

—  Choice  of  weight,  for  example,  medium  or  bold. 

—  Choice  of  proportion,  for  example,  condensed,  normal,  or 
extended. 

—  Choice  of  slant,  for  example,  roman  or  italic. 

—  Choice  of  point  size,  for  example,  8,  10,  or  14  point. 

—  Use  of  capitals  or  lower  case,  for  example,  all  capitals,  all  lower 
case,  initial  capitals,  small  capitals,  embedded  capitals,  and 
standard  prefixes  (such  as  “#”). 

—  An  overlayed  gray  screen  tint,  or  reversed  type  (white  on 
black). 
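
One way to picture such a mapping is as a lookup table from token class to a bundle of typographic attributes. The sketch below is only an illustration under stated assumptions: the type and function names (`token_class`, `type_style`, `style_for`) are hypothetical, only a handful of the token classes are shown, and the particular typefaces, weights, and sizes are placeholder values rather than the encodings recommended by this report.

```c
#include <stddef.h>

/* A few of the token classes from the hierarchy above (illustrative subset). */
enum token_class {
    TOK_COMMENT,
    TOK_PUNCTUATION,
    TOK_RESERVED_WORD,
    TOK_GLOBAL_VARIABLE,
    TOK_MANIFEST_CONSTANT,
    TOK_STRING_CONSTANT,
    TOK_CLASS_COUNT
};

enum weight { WEIGHT_MEDIUM, WEIGHT_BOLD };
enum slant  { SLANT_ROMAN, SLANT_ITALIC };

/* The typographic attributes that can be varied for each token class. */
struct type_style {
    const char *typeface;
    enum weight weight;
    enum slant  slant;
    int         point_size;
};

/* Placeholder encodings; every value here is an assumption for illustration. */
static const struct type_style style_table[TOK_CLASS_COUNT] = {
    [TOK_COMMENT]           = { "Times Roman", WEIGHT_MEDIUM, SLANT_ROMAN,   8 },
    [TOK_PUNCTUATION]       = { "Helvetica",   WEIGHT_BOLD,   SLANT_ROMAN,  10 },
    [TOK_RESERVED_WORD]     = { "Helvetica",   WEIGHT_BOLD,   SLANT_ROMAN,  10 },
    [TOK_GLOBAL_VARIABLE]   = { "Helvetica",   WEIGHT_MEDIUM, SLANT_ITALIC, 10 },
    [TOK_MANIFEST_CONSTANT] = { "Helvetica",   WEIGHT_MEDIUM, SLANT_ROMAN,  10 },
    [TOK_STRING_CONSTANT]   = { "Helvetica",   WEIGHT_MEDIUM, SLANT_ROMAN,  10 },
};

/* Return the style for a token class, or NULL if the class is out of range. */
const struct type_style *style_for(enum token_class c)
{
    if ((int)c < 0 || (int)c >= TOK_CLASS_COUNT)
        return NULL;
    return &style_table[c];
}
```

Keeping the encoding in a single table makes it easy to try experimental variations by editing one row at a time.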

Section 4.5  The Presentation of Preprocessor Commands


The lexical structure of C encodes all preprocessor commands
with a prepended “#”. In addition, a standard convention for C
programming is the use of all capitalized letters to differentiate
preprocessor identifiers (such as manifest constants) from all
other tokens.

We have considered or experimented with additional encoding
and differentiation, for example:

— Use of typographic attributes such as described in the preceding
section.

— Use of positional encodings such as locating all preprocessor
commands at the left margin or even exdenting them so that
the “#” is in the margin.

— Use of definitional encoding, i.e., showing the macro call in
relationship to the text into which it expands.
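
The idea of definitional encoding, showing a macro call next to the text into which it expands, can be approximated within C itself by using the preprocessor's stringizing operator. The sketch below is an assumption-laden illustration, not a description of how SEE works: `STRINGIFY`, `EXPANSION_OF`, and `format_definition` are hypothetical names, and `MAXVAL` and `BUFSIZE` are borrowed from the desk calculator example.

```c
#include <stdio.h>

#define MAXVAL  100     /* manifest constants from the calculator example */
#define BUFSIZE 100

/* STRINGIFY turns its argument into a string without expanding it;
   EXPANSION_OF expands the argument first, then stringizes the result. */
#define STRINGIFY(x)    #x
#define EXPANSION_OF(x) STRINGIFY(x)

/* Write "call  ->  expansion" into out; returns what snprintf returns. */
int format_definition(char *out, size_t size,
                      const char *call, const char *expansion)
{
    return snprintf(out, size, "%s  ->  %s", call, expansion);
}
```

For example, `format_definition(buf, sizeof buf, STRINGIFY(MAXVAL), EXPANSION_OF(MAXVAL))` produces the text `MAXVAL  ->  100`, pairing the name as written with the text that replaces it.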

Section 4.6  The Presentation of Declarations


Thus far we have considered only a program's imperative
statements, i.e., statements that transform existing data to
produce new data. However, much of a program's intractability
often occurs in the declarative aspects, i.e., the declaration of
variables as instances of particular data types and the
initialization specifying values for certain variables. Again, the
issue is complicated by the fact that programs are often scanned
for a variety of motives.

We considered or experimented with various methods of using
rules and tabular typesetting to enhance the legibility and
readability of complex C data declarations, type definitions, and
data initialization. These typographic techniques included:

—  Consistent  use  of  line  spacing,  underline  rules,  and  gray  screen 
tints  to  distinguish  sequences  of  similar  lines. 

—  Multi-column  setting  of  long  sequences  of  short  declarations  or 
of  lengthy  initialization  text. 

—  Tabular  setting  of  sequences  of  declarations  of  variables  of 
simple  type. 

—  Tabular  setting  of  declarations  of  variables  of  complex  type. 

Section 4.7  The Visual Parsing of Expressions


One of the most difficult aspects of the detailed reading of a
computer program occurs in the attempt to parse a complex
(arithmetic or logical) expression. This is particularly true in the
programming language C, where 46 different operators occur at 16
levels of precedence, some associating left to right, others
associating right to left [Harbison & Steele, 1984]. Current
methods of program visualization provide little help to the reader
trying to decipher an expression other than the explicit indication
of nesting and grouping through the inclusion of parentheses. The
resulting visual clutter and masking of what is essential is readily
apparent in languages such as LISP.
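
To make the parsing burden concrete, the following sketch shows a few C expressions whose grouping is decided by precedence and associativity rather than by anything visible on the page. The function names are hypothetical and chosen only for this illustration; each comment gives the fully parenthesized reading that the compiler applies.

```c
/* Subtraction associates left to right: a - b - c means (a - b) - c. */
int subtract_chain(int a, int b, int c)
{
    return a - b - c;
}

/* Assignment associates right to left: *x = *y = z means *x = (*y = z). */
int assign_chain(int *x, int *y, int z)
{
    return *x = *y = z;
}

/* Multiplication binds more tightly than addition: a + (b * c). */
int scale_and_add(int a, int b, int c)
{
    return a + b * c;
}

/* The conditional operator associates right to left:
   n > 0 ? 1 : (n < 0 ? -1 : 0). */
int sign_of(int n)
{
    return n > 0 ? 1 : n < 0 ? -1 : 0;
}
```

For instance, `subtract_chain(10, 4, 3)` yields 3, whereas the right-to-left reading `10 - (4 - 3)` would yield 9.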

We  considered  or  experimented  with  various  methods  of  using 
typographic  attributes  to  enhance  the  legibility  and  readability  of 
complex  C  expressions.  These  typographic  techniques  included: 

—  Use  of  ligatures,  kerning,  and  other  controls  over  letter  spacing 
to  bind  tokens  together  more  tightly. 

—  Controls  over  word  spacing. 

—  Variations  of  the  point  size  of  operators. 

—  Variations  of  the  weight  of  operators. 

—  Control  over  the  vertical  placement  of  unary  operators. 

—  Variations  in  the  point  size  of  parentheses. 

— Use of light square under-brackets or other diagrammatic
notations.

—  Explicit  introduction  of  line  breaks. 

—  Control  over  the  vertical  placement  of  phrases. 

Section 4.8  The Visual Parsing of Statements

Another vital carrier of the meaning of a program is the syntactic
structure of program statements. Statements within a typical C
program may nest recursively. At any level, statements such as
the if, do...while, and switch contain several component
expressions or statements that must be parsed and understood in
order that the statement as a whole may be understood. The
resulting configuration of separate and nested statements presents
a challenge to effective spatial structuring.

We  considered  or  experimented  with  various  methods  of  applying 
visible  language  attributes  to  enhance  a  reader’s  ability  to  parse 
complex  C  statements.  These  attributes  included: 

—  The  amount  of  indentation  used  in  visually  encoding  the  nesting 
of  phrases  within  statements,  for  example,  1,  2  or  3  picas  for 
each  level  of  indentation. 

—  If  there  are  more  than  3  or  4  levels  of  indentation,  clustering  of  3 
or  4  adjacent  levels  into  groups,  distinguishing  the  groups  by 
larger  indentations,  rules,  leader  lines,  gray  screen  tints,  or  other 
visual  devices.  The  indentation  of  a  group  could  be,  for 
example,  8,  10,  or  12  picas  from  the  left  margin  of  the  preceding 
group. 

— The horizontal position of a left brace, e.g., all the way to the
left, hierarchically aligned with the text on the “current line”, at
the end of the text on the “previous line”, and all the way to the
right. In the cases of positioning braces in a channel of their
own to the left or the right, the braces can be indented within
the channel various amounts to encode the hierarchy level.

— The vertical position of the left brace, e.g., the “previous line”,
between the previous line and the “current line”, or the current
line.

— The horizontal position of a right brace, e.g., all the way to the
left, at the end of the text on the “current line”, and all the way
to the right. In the cases of positioning braces in a channel of
their own to the left or the right, the braces can be indented
within the channel various amounts to encode the hierarchy
level.

— The vertical position of the right brace, e.g., the “current line”,
between the current line and the “next line”, or the next line.

— Removal of braces altogether, thereby relying upon precise
indentation only to encode visual hierarchy. Alternatively,
replacement of braces with a new diagrammatic notation using
arrows, pointing symbols, nested brackets, parallel vertical
lines, or channels of varying gray value.

— Suppression of line breaks normally introduced where statements
are very short.

— Placement of line breaks according to various rules and
heuristics, for example, where the line “runs off the edge”, before
or after an operator of low precedence such as “||”, or such
as to create a set of “similar” lines.

—  The  amount  of  indentation  used  after  a  line  break,  in  various 
increments  finer  than  the  amount  of  indentation  used  to  encode 
new  levels. 

—  The  amount  of  line  spacing  used  between  segments  of  a  broken 
line,  starting  with  the  standard  line  spacing  and  decreasing  it 
slightly  by  one  or  two  points. 

— The use of various diagrammatic notations to indicate continuity
with segments of a broken line, such as arrows, ellipses, or
regions of gray value.

— The use of various diagrammatic notations such as pointing
figures to indicate “unusual” control constructs. A definition of
this concept for C might be any label, any goto statement, any
continue statement, any break statement not at the end of a
case, any statement ending a case that is not a break statement,
and any return statement not at the end of a function definition.
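
The definition of an “unusual” control construct given in the last item above can be written down as a small predicate. The sketch below assumes that some earlier analysis has already classified each statement and noted whether it ends a case or a function definition; the names `stmt_kind`, `stmt_info`, and `is_unusual` are hypothetical and are not part of SEE.

```c
/* Statement kinds relevant to the definition; everything else is STMT_OTHER. */
enum stmt_kind {
    STMT_LABEL,
    STMT_GOTO,
    STMT_CONTINUE,
    STMT_BREAK,
    STMT_RETURN,
    STMT_OTHER
};

/* Facts about one statement, assumed to be supplied by a parser. */
struct stmt_info {
    enum stmt_kind kind;
    int ends_case;      /* nonzero if it is the last statement of a case */
    int ends_function;  /* nonzero if it is the last statement of a function */
};

/* Return nonzero if the statement should be flagged as "unusual". */
int is_unusual(const struct stmt_info *s)
{
    switch (s->kind) {
    case STMT_LABEL:
    case STMT_GOTO:
    case STMT_CONTINUE:
        return 1;                       /* always flagged */
    case STMT_BREAK:
        return !s->ends_case;           /* a break anywhere but at the end of a case */
    case STMT_RETURN:
        /* a return elsewhere than at the end of the function, or one that
           ends a case in place of a break */
        return !s->ends_function || s->ends_case;
    default:
        return s->ends_case;            /* a case that falls through */
    }
}
```

Under this reading, each `break` that closes a case in the desk calculator's switch is not flagged, while a case that ended with an ordinary assignment and fell through to the next case would be.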


Program  Visualization  Project  Final  Report  Chapter  4  Section  4.9:  Page  49 

Human  Computing  Resources  Theory.  Results.  Graphic  Design  of  C  The  Presentation  of 

Aaron  Marcus  and  Associates  Conclusions  Source  Code  and  Function  Definitions 

Comments 


Section 4.9  The Presentation of Function Definitions

We also had to develop mechanisms to highlight the program's
constituent structure in terms of its internally defined functions.
The presence of functions helps determine for the reader the
general sequence and rationale for the program's structure. Making
these major “chunks” of the program immediately accessible can
contribute significantly to the program's readability. We
considered or experimented with the following techniques:

— Use of pagination to minimize the splitting of function
definitions across page boundaries in ways that result in placing
most of the text on one page and only a few lines on a subsequent
page.

— Use of rules of varying weights under the declaration of the
function name and formal parameter list.

—  Use  of  rules  of  varying  weights  under  the  last  declaration  of  a 
formal  parameter. 

—  Use  of  headlines  for  the  declaration  of  the  function  name  and 
formal  parameter  list. 

—  Placement  of  the  type  of  the  value  returned  by  the  function,  if 
any,  on  a  line  separate  from  the  function  name  and  formal 
parameter  list. 
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Section  4.10 

The  Presentation  of  Program  Structure 

A  C  program  consists  of  one  or  more  C  source  files.  Each  source 
file  contains  a  portion  of  the  entire  C  program,  some  number  of 
top-level-declarations.  These  top-level-declarations  are  either  dec¬ 
larations  of  identifiers  used  in  the  program  or  function  definitions 
elaborating  the  meaning  of  new  C  procedural  constructs  called 
functions  by  defining  them  in  terms  of  existing  C  constructs. 


SEE, the visual C compiler, produces a listing of a file with respect
to a set of included external files binding the external references.
These included header files typically contain declarations of
identifiers, functions, manifest constants, and newly defined types. The
declared functions are often defined in “standard libraries” which
are stored on the system and which contain functions generally
useful to all C programmers.

We  considered  or  experimented  with  the  following  techniques: 

—  Highlighting  the  global  variables  by  a  variety  of  typographic 
methods  as  in  Section  4.4. 

—  The  use  of  a  novel  mechanism  to  aid  the  reading  of  complex  pro¬ 
grams  structured  as  a  collection  of  files  by  adding  to  each  pro¬ 
gram  page  footnotes  that  contain  cross-references  indicating 
where  in  an  included  file  an  external  identifier  is  defined  and 
where  each  identifier  defined  on  a  page  is  used.  This  produces, 
in  essence,  a  cross-reference  listing  distributed  throughout  the 
entire  program  on  pages  where  it  is  relevant. 
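
A minimal sketch of how such per-page footnotes might be assembled is given below. It covers only the first half of the mechanism (where an external identifier used on a page is defined) and assumes the cross-reference data has already been collected. The `XrefUse` record, its fields, and `format_footnotes` are hypothetical names chosen for illustration; they are not part of SEE.

```c
#include <stddef.h>
#include <stdio.h>

/* One use of an external identifier; all fields are illustrative. */
typedef struct {
    const char *identifier;    /* name as it appears in the source */
    const char *defined_in;    /* included file holding the definition */
    int         defined_line;  /* line of the definition in that file */
    int         used_on_page;  /* listing page on which the use appears */
} XrefUse;

/* Appends one footnote line to `out` for every entry used on `page`.
   Returns the number of footnotes written; stops early if `out` is full. */
size_t format_footnotes(const XrefUse *uses, size_t n, int page,
                        char *out, size_t out_size)
{
    size_t i, used = 0, count = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        int written;

        if (uses[i].used_on_page != page)
            continue;
        written = snprintf(out + used, out_size - used, "%s: %s, line %d\n",
                           uses[i].identifier, uses[i].defined_in,
                           uses[i].defined_line);
        if (written < 0 || (size_t)written >= out_size - used) {
            out[used] = '\0';   /* discard the partial footnote */
            break;
        }
        used += (size_t)written;
        count++;
    }
    return count;
}
```

Listing the uses of each identifier defined on a page would need a second table keyed the other way round, which this sketch leaves out.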

Chapter 5 Conclusions

The  previous  chapters  have  presented  a  classification  of  issues 
affecting  program  legibility  and  readability.  We  have  seen  that 
there  are  complex  interactions  of  visible  language  attributes  both 
among  themselves  and  in  relation  to  the  C  programming  lan¬ 
guage.  Despite  this,  the  task  of  developing  a  recommended  form 
has  proven  to  be  tractable,  and  we  have  been  able  to  do  many 
experimental  variations  before  suggesting  an  optimum  appear¬ 
ance. 

Based  on  our  work,  we  believe  that  a  comprehensive,  consistent, 
and  effective  presentation  of  a  graphic  design  schema  for  the 
appearance  of  C  is  desirable  to  improve  program  legibility  and 
readability,  that  we  have  demonstrated  the  feasibility  of  develop¬ 
ing  such  a  schema,  and  that  a  graphic  design  manual  for  the  visi¬ 
ble  language  characteristics  is  an  appropriate  vehicle  in  which  to 
present the resulting recommended conventions. As more programmers
use the conventions, as they are refined and improved
through  this  use,  and  as  more  human  factors  knowledge  about 
program  literature  becomes  available,  the  conventions  will  mature 
into  effective  standards. 

In  achieving  this  set  of  objectives,  we  have  also  encountered 
many  unforeseen  conceptual  and  technical  difficulties.  When  we 
began  our  project,  we  originally  desired  a  solution  for  the  general 
problem  of  typographic  and  non-typographic  representation  of 
programming languages in both static formats and dynamic formats,
i.e., those in an interactive environment. We
soon  realized  that  even  the  more  restricted  problem  of  determin¬ 
ing  static,  typographic  representations  was  a  challenge.  At  the 
time,  a  wide  variety  of  laser  printer  fonts  of  high  quality  was  not 
readily  available,  and  it  was  difficult  to  create  even  manually 
composed  pages.  We  have  also  had  to  combat  a  great  deal  of 
additional  recalcitrant  technology  (see  Chapter  6). 

The  approach  and  many  of  the  concrete  recommendations  for  C 
can  be  transferred  to  other  languages,  such  as  Pascal  and  Ada. 

We  must  advise  those  attempting  such  designs,  however,  that  the 
task  will  require  extremely  careful  attention  to  each  language’s 
unique  characteristics.  By  studying  these  characteristics,  it  will 
be  possible  to  design  effective  visualizations  that  take  advantage 
of  visible  language  and  of  the  computer  language’s  full  potential. 

One of the primary difficulties encountered in making graphic
design  evaluations  is  that  our  knowledge  of  detailed  reading 
motivations  and  strategies  in  programmers  is  limited  (see  Chapter 
6).  As  a  result,  it  is  not  yet  possible  to  base  decisions  among 
approximately  equivalent  appearances  on  any  scientific  criteria. 
Nevertheless,  we  believe  that  our  general  methodology  is  sound, 
and  that  our  results  are  significant  improvements. 

Were  we  to  have  merely  designed  unique  prototypes  for  improve¬ 
ment,  this  would  have  had  some  value.  However,  we  have  gone 
beyond this to provide a tool for automatically generating an
improved appearance for most C programs. In addition, because
it  is  likely  that  our  conventions  will  change  over  the  coming 
years,  we  have  also  provided  a  flexible  tool  for  editing  and  refin¬ 
ing  the  appearance  of  these  automatically  produced  program 
visualizations.  Our  SEE  compiler  is  one  of  the  most  elaborately 
tunable  visible  language  processing  engines  available,  building  as  it 
does  both  upon  the  technology  of  the  Portable  C  Compiler  [Johnson, 
1979]  and  upon  all  of  TROFF’s  text  manipulation  capabilities.  We 
have  pushed  these  tools  as  far  as  they  can  go  in  directions  for  which 
they  were  never  intended.  Future  developers  will  therefore  need  to 
provide  SEE’s  functionality  (see  Volume  6)  in  a  far  more  appropri¬ 
ate  and  robust  implementation  than  our  prototype. 

Thus our approach and our accomplishment have been to design
both the best possible appearance for the C programming language
within technical and time constraints and a suitable prototype
of an effective tool for automating, editing, and refining this
appearance.

Our future research directions are detailed in the next
chapter.

Chapter 6 Future Research


Program  Visualization  Algorithms 

There are a number of areas fundamental to the enhanced presentation
of source text that we have not yet automated. These are the
automatic  introduction  of  white  space,  appropriate  automatic  line 
breaking,  appropriate  automatic  page  breaking,  incorporation  of 
programmer  formatting  intentions,  display  of  pragmatics,  display  of 
diagrammatic  representations,  and  comprehensive  automatic  warn¬ 
ings  and  annotations. 

Good  programmers  add  blank  lines  (white  space)  to  enhance  the 
readability  of  their  code.  A  program  visualizer  must  do  this  auto¬ 
matically  and  correctly.  An  effective  algorithm  will  note  the  tran¬ 
sitions  between  different  kinds  of  program  source  text,  classifying 
each line as a comment, a preprocessor command, a component of a
function header, a statement within a function body, a component
of a type definition, or a component of any other kind of declaration.
It will then introduce white space between a line of one kind
and  a  line  of  another  kind.  Exactly  how  much  space  should  be 
introduced  for  each  kind  of  transition,  as  well  as  the  special  cases 
not  handled  by  this  simple  procedure,  must  be  a  subject  for  future 
research. 
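
The spacing step of this procedure can be sketched as follows. The sketch assumes that each line has already been classified, for example by the parser, and it inserts the same single blank line at every transition. The `LineKind` names and the function `mark_white_space` are hypothetical, and how much space each kind of transition deserves is left open, as the text says.

```c
#include <stddef.h>

/* The kinds of source line named in the paragraph above. */
typedef enum {
    KIND_COMMENT,
    KIND_PREPROCESSOR,
    KIND_FUNCTION_HEADER,
    KIND_STATEMENT,
    KIND_TYPE_DEFINITION,
    KIND_OTHER_DECLARATION
} LineKind;

/* Sets blank_before[i] to 1 when line i is of a different kind from
   line i - 1, and to 0 otherwise. Returns the number of blank lines
   that would be introduced. */
size_t mark_white_space(const LineKind *kinds, size_t n, int *blank_before)
{
    size_t i, inserted = 0;

    for (i = 0; i < n; i++) {
        blank_before[i] = (i > 0 && kinds[i] != kinds[i - 1]);
        inserted += (size_t)blank_before[i];
    }
    return inserted;
}
```

A caller would emit a blank line before every line whose flag is set. Special cases, such as a comment that belongs with the code following it, are not handled here.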

No matter how much space exists for a line on a page, some programmers
will write some statements that will need to be “broken”
and wrapped to the next line. The result is of course ugly (see
Figure 5 of Volume 3), but an appropriate line breaking algorithm can
minimize the visual chaos and damage that results. An effective
algorithm  will  scan  backwards  from  the  point  representing  the  most 
text  that  will  fit  on  the  line,  will  examine  the  precedence  of  the 
operators  that  precede  that  point,  and  will  try  to  find  an  operator  of 
“relatively low” precedence that is not “too far” from that point as
the  place  at  which  to  make  the  break.  The  algorithm  will  be  com¬ 
plicated  by  the  occurrence  of  long  string  constants  and  will  have 
particular  difficulty  with  lines  that  begin  very  deeply  indented. 
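
The backward scan described above might look like the sketch below. It is deliberately simplified: it recognizes only a handful of single-character operators, knows nothing about string constants, unary operators, or multi-character operators such as `==`, and treats “too far” as a fixed character count. The rank table, `break_rank`, and `choose_break` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ranks: a lower rank binds less tightly and so makes a
   better break point. 0 means "not an operator this sketch knows". */
static int break_rank(char c)
{
    switch (c) {
    case ',':           return 1;
    case '=':           return 2;
    case '+': case '-': return 3;
    case '*': case '/': return 4;
    default:            return 0;
    }
}

/* Returns how many characters of `line` to keep on the first output
   line. A line that already fits returns its own length. Otherwise the
   last `max_back` columns that fit are scanned backwards and the break
   is placed just after the lowest-ranked operator found; if none is
   found, the line is cut at `width`. */
size_t choose_break(const char *line, size_t width, size_t max_back)
{
    size_t len = strlen(line);
    size_t best = width, i, scanned;
    int best_rank = 0;

    if (len <= width)
        return len;
    for (i = width, scanned = 0; i > 0 && scanned < max_back; i--, scanned++) {
        int rank = break_rank(line[i - 1]);

        /* Strict comparison: on a tie, the operator nearest the limit wins. */
        if (rank != 0 && (best_rank == 0 || rank < best_rank)) {
            best_rank = rank;
            best = i;
        }
    }
    return best;
}
```

Widening `max_back` lets the scan reach a lower-precedence operator at the cost of a shorter first line, which is the trade-off between “relatively low” and “too far” that the text describes.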

Automatic  page  breaking  and  pagination  is  an  even  more  difficult 
problem.  An  implementation  problem  with  the  current  generation 
of  text  formatters  (see  below)  is  the  need  for  a  great  deal  of  look¬ 
ahead  in  order  to  do  the  page  breaking  properly.  There  are  also 
severe conceptual problems. The basic idea is that there should ideally
never be fewer than three lines in a related “group” of statements
at the top or the bottom of the page. The notion of a group here is
related to the concept of the “kind” of source text line defined
in the discussion of white space above. The algorithm becomes difficult because it
is  not  always  possible  to  fulfill  this  condition,  because  we  want  to 
break  the  page  at  a  point  that  is  as  shallowly  nested  as  possible, 
because  we  want  to  avoid  separating  an  external  or  internal  com¬ 
ment  from  the  code  following  it  to  which  it  typically  refers,  and 
because  we  want  at  almost  any  cost  to  avoid  breaking  in  places 
such  as  in  the  middle  of  a  function  header,  a  typedef  definition, 
or  a  structure  definition. 
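
Two of these constraints, the three-line rule and the preference for shallow nesting, can be combined in a sketch like the one below. It assumes each line carries a group number and a nesting depth supplied by an earlier pass, and it ignores comments, function headers, and type definitions. `PageLine`, `choose_page_break`, and the `min_fill` parameter (the fewest lines the caller will accept on a page) are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int group;   /* consecutive lines with equal ids form one group */
    int depth;   /* nesting depth of the line */
} PageLine;

#define MIN_GROUP_LINES 3

/* Length of the run of same-group lines ending at index `last`. */
static size_t run_before(const PageLine *lines, size_t last)
{
    size_t count = 1;

    while (last > 0 && lines[last - 1].group == lines[last].group) {
        count++;
        last--;
    }
    return count;
}

/* Length of the run of same-group lines starting at index `first`. */
static size_t run_after(const PageLine *lines, size_t n, size_t first)
{
    size_t count = 1;

    while (first + 1 < n && lines[first + 1].group == lines[first].group) {
        count++;
        first++;
    }
    return count;
}

/* Returns how many of the `n` lines to place on the current page. */
size_t choose_page_break(const PageLine *lines, size_t n,
                         size_t capacity, size_t min_fill)
{
    size_t b, best = 0;

    if (n <= capacity)
        return n;               /* everything fits on this page */
    for (b = capacity; b >= 1 && b >= min_fill; b--) {
        if (lines[b - 1].group == lines[b].group &&
            (run_before(lines, b - 1) < MIN_GROUP_LINES ||
             run_after(lines, n, b) < MIN_GROUP_LINES))
            continue;           /* would strand fewer than three lines */
        if (best == 0 || lines[b].depth < lines[best].depth)
            best = b;           /* shallowest start of the next page so far */
    }
    return best != 0 ? best : capacity;   /* no acceptable break: fill the page */
}
```

The fallback on the last line reflects the remark that the condition cannot always be fulfilled. The sketch also shows why look-ahead is needed: the decision for one page depends on lines that belong to the next.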

An  alternate  approach  to  the  optimization  of  line  breaking  and 
page  breaking  and  to  the  very  difficult  unsolved  problem  of  the 
effective  display  of  initializers  is  the  incorporation  of  programmer 
formatting  intentions.  In  other  words,  the  visualizer  should  heed 
the  directions  of  the  programmer  when  she  inserts  carriage  returns 
in  the  middle  of  statements,  extra  carriage  returns  between  state¬ 
ments  or  function  definitions,  and  tabs  or  carriage  returns  in  the 
middle  of  expressions  or  initializers.  How  to  reconcile  these  speci¬ 
fications  with  the  default  automated  decisions  of  the  visualizer  is  a 
subject  for  future  research. 

Another  important  topic  is  the  display  of  pragmatics,  features  of  the 
code  in  use.  A  good  example  is  the  need  to  know  what  code  has 
changed  since  the  last  version.  An  effective  algorithm  may  employ 
conventions  such  as  the  use  of  a  new  font  or  a  gray  background  to 
highlight  code  that  has  been  added,  and  a  diagrammatic  convention 
such  as  a  strike-through  line  to  show  where  code  has  been  deleted 
and  what  has  been  removed. 

We  have  in  our  work  not  yet  touched  on  the  possibilities  for  and 
the  problems  in  the  automatic  generation  of  effective  diagrammatic 
representations.  There  is  a  rich  variety  of  techniques  to  be  consid¬ 
ered  (see,  for  example,  [Martin  &  McClure,  1985]).  Future 
research  is  required  to  select  the  most  valuable  representations,  and 
to  devise  algorithms  for  automatic  conversion  between  source  code 
and  diagram. 

Finally,  the  introduction  of  fingers  pointing  at  “abnormal”  control 
flow  illustrates  the  need  to  develop  mechanisms  for  the  automatic 
addition of warnings and annotations. Other examples are the conditions
currently detected by the LINT program [Johnson, 1978].
These include unused variables and functions, variables used
before  they  are  set,  unreachable  parts  of  the  program,  and 
mismatches between function declarations and uses in terms of the
number and types of arguments. Researchers in automatic
programming  will  be  able  to  propose  far  more  substantive  ways  in 
which  a  programmer's  assistant  can  detect  features  of  a  program 
and  write  its  suggestions  on  the  listing  for  consideration  by  the  pro¬ 
grammer. 

Visualization of Other Programming Languages

Our  work  needs  to  be  extended  to  programming  languages  other 
than  C. 

The  extension  to  other  ALGOL-like  languages,  e.g.,  PASCAL  and 
ADA, will be straightforward. The most significant area where
some conceptual work may be needed is the effective
representation of multi-tasking in ADA.

Languages  for  artificial  intelligence  work,  e.g.,  LISP,  PROLOG,  and 
SMALLTALK,  may  present  a  greater  challenge.  Designers  will  have 
to  combat  the  sea  of  parentheses  presented  by  LISP  and  will  need  to 
consider  the  rich  data  structures  and  control  flow  mechanisms 
either  directly  present  in  these  languages  or  available  through  their 
many  extensions. 

Interactive  Enhancements  of  Source  Text 

Even  more  interesting  is  the  extension  of  this  work  to  the  interac¬ 
tive  display  and  manipulation  of  program  source  text. 

One immediate problem that must be faced is the lower resolution
(typically, no more than 100 dots per inch) of interactive display
devices. This
may  require  modification  of  many  of  the  techniques  that  employ  a 
variety  of  fonts,  styles,  and  sizes  and  that  employ  rules  and  other 
diagrammatic  devices. 

On  the  positive  side,  interactive  program  visualization  offers  a  host 
of  new  opportunities  to  incorporate  dynamics,  animation,  color,  and 
sound. We are no longer faced with the difficult problem of establishing
“the best” mapping between token types and typographic
styles, for the program can be easily re-displayed with different settings.
Even more significantly, we can depict, through image
dynamics and through animation, features of the program in execution.
This is, quite literally, an entire new dimension of program
visualization. 

Implementation of Program Visualization Processors

As we have intimated above, there are a great many problems
remaining to be solved before a system such as SEE can be implemented
with ease.

As  is  explained  in  more  detail  in  Volume  6,  SEE  was  implemented 
by  making  modifications  and  extensions  to  the  Portable  C  Com¬ 
piler.  This  did  not  result  in  an  appropriate  and  robust  implementa¬ 
tion.  Visual  compiling  is  a  very  different  problem  from  standard 
compilation,  even  though  it  shares  common  elements  such  as  the 
need  to  do  lexical  analysis  and  the  need  to  do  parsing.  Future 
investigators  must  therefore  develop  an  appropriate  and  effective 
visual  compiler  technology. 

We  have  also  been  handcuffed  by  the  lack  of  an  appropriate  docu¬ 
ment  formatting  technology.  The  nature  of  TROFF’s  processing  of 
text makes formatting that requires look-ahead, such as line breaking
and page breaking, very difficult. Standard TROFF, despite the
fact that it is supposed to be “device-independent”, is very difficult
to  port  to  new  hardware  and  to  new  fonts.  It  is  also  impossible  to 
do  conversational,  interactive  document  formatting  with  TROFF;  all 
text  must  be  processed  from  the  very  beginning  of  the  document. 

To  build  the  most  effective  program  visualization  aids,  we  require 
that  research  be  done  on  all  three  of  these  problems. 

As  we  have  indicated,  program  visualization  requires  fonts  chosen 
with  great  care  and  attention  to  the  fine  detail  that  occurs  in  com¬ 
puter  program  source  text.  The  design  of  fonts  that  are  optimal  for 
the  display  of  computer  programs  rather  than  English  prose  is  there¬ 
fore  another  task  for  future  research. 

Finally, the design and implementation of interactive visualizers
will  raise  an  entirely  new  set  of  issues  that  go  beyond  those  encoun¬ 
tered  in  this  work. 

The  Human  Factors  of  Program  Reading 

There  also  remains  a  broad  body  of  concerns  and  questions  that 
relate  to  the  need  to  substantiate  experimentally  that  the  methods 
of  presentation  we  propose  are  effective  in  making  programs  more 
legible,  readable,  intelligible,  memorable,  and  maintainable. 

We  must  begin  with  an  investigation  into  how  programmers  read,  a 
characterization  of  the  cognitive  and  perceptual  processes  that  com¬ 
prise  the  task.  An  information  processing  model  of  program  reading 
would greatly assist the design of methods of presentation that
facilitate  the  act  of  reading. 

We must then try to measure whether our display conventions make
programs more legible, readable, intelligible, memorable, and
maintainable, and, if so, by how much.

Finally,  we  must  investigate  in  what  ways  our  methods  of  presen¬ 
tation  are  better.  What  aspects  of  our  conventions  are  helpful, 
which  are  harmful,  and  why? 

MISSION

of

Rome Air Development Center

RADC plans and executes research, development, test
and selected acquisition programs in support of
Command, Control, Communications and Intelligence
(C3I) activities. Technical and engineering
support within areas of competence is provided to
ESD Program Offices (POs) and other ESD elements
to perform effective acquisition of C3I systems.

The areas of technical competence include
communications, command and control, battle
management, information processing, surveillance
sensors, intelligence data collection and handling,
solid state sciences, electromagnetics, and
propagation, and electronic, maintainability,
and compatibility.